Aside

Contact

Language Skills

R
Bash
Git/GitHub
Python

Disclaimer

Made with the R package datadrivencv and pagedown.

The source code is available on github.com/dzhang32/cv.

Last updated on 2021-03-09.

Main

David Zhang

During my PhD, I have developed and applied algorithms that integrate large-scale genetic and transcriptomic datasets to improve the diagnosis rate of rare disease patients. I’m passionate about developing and releasing robust, user-friendly software that empowers geneticists.

Education

Research assistant, part-time PhD, Bioinformatics

University College London

London, UK

Present - 2017

  • Thesis: Using transcriptomics to improve the diagnosis rate of rare disease patients.
  • The goal of my PhD is to develop and apply statistical methods and software that improve genetic diagnosis rates using RNA-sequencing.

MSc, Neuroscience

University College London

London, UK

2016 - 2015

  • Thesis: The role of mitochondrial dysfunction in Xeroderma pigmentosum.
  • Grade: Merit (68%)
  • Awarded post-graduate support scheme bursary (£10,000)

BSc, Biomedical science

University College London

London, UK

2015 - 2012

  • Thesis: Investigating the function of CYFIP1 in the development of rat hippocampal neurons.
  • Grade: 2:1 (69%)

H.S.

Queen Elizabeth’s School

Barnet, UK

2012 - 2007

  • Grade: Maths (A*), Biology (A*), Chemistry (A*), Sociology (A).



Research Experience

Honorary Researcher (2 months)

Johns Hopkins Bloomberg School of Public Health

Remote

2020

  • In collaboration with Leonardo Collado-Torres, we used the recount3 dataset and LIBD samples to study the effect of complex splicing in individuals with neurological disease.

Research Technician

University College London

London, UK

2017 - 2016

  • Used R and Bash to investigate the effect of genetic variation on cognition and the age of onset of dementia in patients with Down syndrome.



Industry Experience

Bioinformatician internship (3 months)

Verge Genomics

Remote

2020

  • Detection of aberrant splicing events in complex disease patients.
  • Used AWS infrastructure to analyse hundreds of RNA-seq samples derived from patients with Parkinson’s disease and amyotrophic lateral sclerosis.



Software & programming

Bioconductor packages

N/A

N/A

Present - 2020

  • dasper: detection of aberrant splicing events in RNA-sequencing. Author and maintainer. XXX downloads.
  • megadepth: BigWig and BAM related utilities. An R wrapper for the megadepth software. Co-author and maintainer. XXXX downloads.

Data science blog posts

N/A

N/A

2021

  • Published chess-related blog posts on Medium. Posts were curated by Towards Data Science and selected for its hands-on-tutorials column, which showcases pieces that highlight data science best practices.
  • Applied Python to the analysis of chess.com data.

Advanced R

N/A

N/A

2021 - 2020

Kaggle town

N/A

N/A

2020

  • Organised a club to study Python and machine learning by tackling Kaggle problems.

Data wrangling

Neuroimmunology & CSF Laboratory, NHS

London, UK

2018 - 2016

  • Developer and maintainer of data wrangling pipelines that improved the efficiency and standardisation of monthly financial reports.

Teaching Experience

Developing Bioconductor Packages

University College London

Virtual Event

2020

Unit testing using testthat edition 3

rstats club

Virtual Event

2020

  • Talk regarding unit testing fundamentals, the importance of testing and new features released in the R package testthat edition 3.

R fundamentals

Clinician Coders

London, UK

2020 - 2018

  • Developed materials and led workshops that taught R fundamentals to clinicians.

RNA-sequencing for diagnostics

King’s College London

London, UK

2020 - 2017

  • Lectured graduate-level students on how transcriptomics can be applied in the diagnostic pipeline.



Selected Publications

Megadepth: efficient coverage quantification for BigWigs and BAMs

Bioinformatics

N/A

2021

  • Wilks C, Ahmed O, Baker DN, Zhang D, Collado-Torres L, Langmead B. 2021. Megadepth: efficient coverage quantification for BigWigs and BAMs. Bioinformatics.
  • Role: R package developer.
  • DOI: https://doi.org/10.1101/2020.12.17.423317

Integration of eQTL and Parkinson’s disease GWAS data implicates 11 disease genes

JAMA Neurology

N/A

2021

  • Kia DA, Zhang D, Guelfi S, Manzoni C, Hubbard L, United Kingdom Brain Expression Consortium (UKBEC), International Parkinson’s Disease Genomics Consortium (IPDGC), Reynolds RH, Botía JA, Ryten M, Ferrari R, Lewis PA, Williams N, Trabzuni D, Hardy J, Wood NW. 2021. Integration of eQTL and Parkinson’s disease GWAS data implicates 11 disease genes. JAMA Neurology.
  • Role: Co-first author.
  • DOI: https://doi.org/10.1001/jamaneurol.2020.5257

Human-lineage-specific genomic elements: relevance to neurodegenerative disease and APOE transcript usage.

Nature Communications

N/A

2021

  • Chen Z, Zhang D, Reynolds RH, Gustavsson EK, Garcia-Ruiz S, D’Sa K, Fairbrother-Brown A, Vandrovcova J, International Parkinson’s Disease Genomics Consortium (IPDGC), Hardy J, Houlden H, Gagliano SA, Botía J, Ryten M. 2021. Human-lineage-specific genomic elements: relevance to neurodegenerative disease and APOE transcript usage. Nature Communications.
  • Role: Analyst.
  • DOI: TBA

Incomplete annotation of disease-associated genes is limiting our understanding of Mendelian and complex neurogenetic disorders.

Science Advances

N/A

2020

  • Zhang D, Guelfi S, Ruiz SG, Costa B, Reynolds RH, D’Sa K, Liu W, Courtin T, Peterson A, Jaffe AE, Hardy J, Botia JA, Collado-Torres L and Ryten M. 2020. Incomplete annotation of disease-associated genes is limiting our understanding of Mendelian and complex neurogenetic disorders. Science Advances.
  • Role: First Author.
  • DOI: https://doi.org/10.1126/sciadv.aay8299

Regulatory sites for known and novel splicing in human basal ganglia are enriched for disease-relevant information.

Nature Communications

N/A

2020

  • Guelfi S, D’Sa K, Botía J, Vandrovcova J, Reynolds RH, Zhang D, Trabzuni D, Collado-Torres L, Thomason A, Leyton PQ, Gagliano SA, Nalls MA, UK Brain Expression Consortium, Small KS, Smith C, Ramasamy A, Hardy J, Weale ME & Ryten M. 2020. Regulatory sites for known and novel splicing in human basal ganglia are enriched for disease-relevant information. Nature Communications.
  • Role: Analyst.
  • DOI: https://doi.org/10.1038/s41467-020-14483-x

Genetic variability in response to Aβ deposition influences Alzheimer’s risk.

Brain Communications

N/A

2019

  • Salih DA, Bayram S, Guelfi S, Reynolds RH, Shoai M, Ryten M, Brenton JW, Zhang D, Matarin M, Botia JA, Shah R, Brookes KJ, Guetta-Baranes T, Morgan K, Bellou E, Cummings DM, Escott-Price V, Hardy J. 2019. Genetic variability in response to Aβ deposition influences Alzheimer’s risk. Brain Communications.
  • Role: Analyst.
  • DOI: https://doi.org/10.1093/braincomms/fcz022

Duplication of 10q24 locus: broadening the clinical and radiological spectrum.

Eur J Hum Genet

N/A

2019

  • Holder-Espinasse M, Jamsheer A, Escande F, Andrieux J, Petit F, Sowinska-Seidler A, Socha M, Jakubiuk-Tomaszuk A, Gerard M, Mathieu-Dramard M, Cormier-Daire V, Verloes A, Toutain A, Plessis G, Jonveaux P, Baumann C, David A, Farra C, Colin E, Jacquemont S, Rossi A, Mansour S, Ghali N, Moncla A, Lahiri N, Hurst J, Pollina E, Patch C, Ahn JW, Valat AS, Mezel A, Bourgeot P, Zhang D, Manouvrier-Hanu S. 2019. Duplication of 10q24 locus: broadening the clinical and radiological spectrum. Eur J Hum Genet.
  • Role: Analyst.
  • DOI: https://doi.org/10.1038/s41431-018-0326-9

Genetic variation within genes associated with mitochondrial function is significantly associated with later age of onset of Parkinson disease and contributes to disease risk.

NPJ Parkinson’s Disease

N/A

2019

  • Billingsley KJ, Barbosa IA, Bandrés-Ciga S, Quinn JP, Bubb VJ, Deshpande C, Botía JA, Reynolds RH, Zhang D, Simpson MA, Blauwendraat C, Nalls MA, Singleton A, International Parkinson’s Disease Genomics Consortium (IPDGC), Ryten M, Koks S. 2019. Genetic variation within genes associated with mitochondrial function is significantly associated with later age of onset of Parkinson disease and contributes to disease risk. NPJ Parkinson’s Disease.
  • Role: Data provider.
  • DOI: https://doi.org/10.1038/s41531-019-0080-x

Variation at the TRIM11 locus modifies Progressive Supranuclear Palsy phenotype.

Annals of Neurology

N/A

2018

  • Jabbari E, John W, Tan MMX, Maryam S, Pittman A, Ferrari R, Mok KY, Zhang D, Reynolds RH, de Silva R, Grimm MJ, Respondek G, Muller U, Al-Sarraj S, Gentleman SM, Lees AJ, Warner TT, Hardy J, Revesz T, Hoglinger GU, Holton JL, Ryten M and Morris HR. 2018. Variation at the TRIM11 locus modifies Progressive Supranuclear Palsy phenotype. Annals of Neurology.
  • Role: Analyst.
  • DOI: https://doi.org/10.1002/ana.25308



Conferences

Genomics England Research Conference

N/A

London, UK

2019

  • Poster: Predicting disease-causing genes using machine learning

Genomics of Rare Disease

N/A

Cambridge, UK

2019

  • Poster: The use of transcriptomics to improve gene annotation
  • Poster: Using machine learning to understand and predict genes causing rare neurological disorders
  • Awarded prize for the best poster (£100)

International Parkinson’s Disease Genomics Consortium (IPDGC)

N/A

Lisbon, Portugal

2019

  • Talk: Incomplete annotation of disease-associated genes is limiting our understanding of Mendelian and complex neurogenetic disorders

European Society of Human Genetics (ESHG)

N/A

Milan, Italy

2018

  • Poster: Incomplete annotation of OMIM genes is likely to be limiting the diagnostic yield from genetic tests.

International Parkinson’s Disease Genomics Consortium (IPDGC)

N/A

Reykjavik, Iceland

2018

  • Poster: Incomplete annotation of OMIM genes is limiting the diagnostic yield from genetic tests.

World Science Conference Israel (WSCI)

N/A

Jerusalem, Israel

2015

  • 1 of 11 UK participants chosen to attend.